GPU Procurement Needs To Account For Obsolescence


The rapid advancement of artificial intelligence (AI) has made it a transformative force across industries. Central to this evolution is the graphics processing unit (GPU), a specialized processor essential for accelerating the parallel computations needed to train and deploy large-scale AI models such as large language models (LLMs) and computer vision systems. Recognizing their critical role, governments and private enterprises around the world are making significant investments in GPU-powered computing infrastructure, viewing it not only as a technological necessity but also as a strategic asset in the global AI race.

The global AI infrastructure market is witnessing explosive growth, with projections estimating it will reach $394 billion by 2030. 1 Compute infrastructure makes up a significant portion of this expansion, fueled by escalating demand for high-performance hardware. In 2024 alone, NVIDIA, the leading chipmaker commanding nearly 80% of the high-performance GPU market, generated over $100 billion in revenue from AI-related hardware. 2

Responding to this momentum, major tech companies have dramatically scaled up their GPU acquisitions. Private players, including Amazon, Google, and Microsoft, are building hyperscale data centers, with global data center capacity expected to grow to 212 GW by 2030. 3 However, this aggressive investment trend is not without criticism. Observers warn that the hype surrounding AI may be driving overinvestment in infrastructure that is inherently vulnerable to rapid obsolescence. 4,5 GPUs, despite their power, typically have a functional lifespan of only 2–5 years. 6 Coupled with the fast-paced evolution of AI technologies, there is a real risk that infrastructure built today may become outdated before it delivers expected returns.

GPUs are used for AI because of their ability to perform parallel processing, enabling the training of complex neural networks. For instance, training a model like OpenAI’s GPT-4 reportedly required 25,000 GPUs over several months, while China’s DeepSeek R1 model used 2,000 GPUs, demonstrating varying scales of compute needs. 7 The demand for GPUs has surged, with global shipments of data center GPUs reaching 3.76 million units in 2023. 8 Amid this meteoric demand and rising geopolitical tensions, governments are leveraging compute infrastructure as a regulatory and geostrategic tool. The United States, home to major technology companies including NVIDIA, the dominant player in GPU production and sales, has imposed export controls on advanced GPUs to China, citing national security concerns, while China is investing heavily in domestic chip development. 9

In this article, we examine the role of GPUs, the obsolescence risks associated with such investments, and their broader implications for India’s AI ambitions.


India’s AI Ambitions

India’s digital economy is thriving, with 1.25 billion internet users, 750 billion minutes of mobile app usage, and 17 GB of data consumption per user per month. The IT and BPM sector contributes 7.5% to India’s GDP, projected to reach 10% by FY25, and the AI market is expected to grow at a 25–35% CAGR, reaching $28.8 billion by 2025. India’s data center capacity, at 950–977 MW in 2023, is set to double to 2,000 MW by 2026, positioning it ahead of regional peers like Japan and Singapore. 10 Despite these strengths, India faces challenges in AI competitiveness, including limited access to high-end compute, talent shortages, and data gaps. The IndiaAI Mission seeks to address these by building a robust AI ecosystem, with a focus on compute infrastructure, datasets, and innovation. 11

India, an emerging global power, is striving to establish itself as a leader in artificial intelligence through the IndiaAI Mission, launched in 2024 with a five-year budget of ₹10,372 crore (approximately $1.3 billion). 12 The mission aims to democratize access to high-performance computing resources for startups, academic institutions, and researchers. This initiative marks a significant step in strengthening India’s AI capabilities and global competitiveness. However, the rapid pace of technological advancement poses a key challenge. The risk of hardware obsolescence threatens the long-term viability of these capital-intensive investments. In the first phase, the mission plans to deploy over 18,000 AI compute units via a public-private partnership (PPP) model, with an additional 15,000 units targeted in the next phase. 13

Under the ₹10,372 crore IndiaAI Mission, approximately 40% of the investment is earmarked for provisioning high-performance GPUs. To achieve this at scale, the government has opted for a public-private partnership (PPP) model. Rather than directly building and managing GPU infrastructure, the government is leveraging private data center operators who will host and maintain the compute units. Access to these GPU resources will be facilitated through a centralized application portal, with pricing set at a nominal rate of roughly $1 per GPU-hour, significantly lower than the global average of $2 to $3. 14 This approach offers dual advantages: it improves operational efficiency by involving experienced private players and circumvents the inefficiencies often associated with public sector projects in India. Importantly, this model also shifts the ownership and lifecycle management of hardware assets (acquisition, maintenance, and eventual disposal) onto private partners. While this reduces the financial and operational burden on the government, it introduces business risk for private entities, especially around the fast-paced obsolescence of AI hardware.
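The size of that subsidy becomes concrete with simple arithmetic. As a rough sketch (the workload figures below are hypothetical, not drawn from the Mission's documents), the gap between the portal rate and the global average compounds quickly at training scale:

```python
# Rough cost comparison of a hypothetical training run at the subsidized
# rate vs. the global average. All workload numbers are illustrative.

def training_cost(num_gpus: int, hours: float, rate_per_gpu_hour: float) -> float:
    """Total cost of renting `num_gpus` for `hours` at a flat hourly rate."""
    return num_gpus * hours * rate_per_gpu_hour

# Hypothetical mid-size run: 512 GPUs for 30 days of continuous training.
gpus, hours = 512, 30 * 24

subsidized = training_cost(gpus, hours, 1.0)  # ~IndiaAI portal rate
market = training_cost(gpus, hours, 2.5)      # midpoint of the $2-$3 global average

print(f"Subsidized: ${subsidized:,.0f}")  # $368,640
print(f"Market:     ${market:,.0f}")      # $921,600
print(f"Saving:     ${market - subsidized:,.0f}")
```

At these illustrative numbers, a single month-long run is over half a million dollars cheaper on the portal, which is also a measure of the cost the government and its private partners must absorb.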



Issues of Technological Obsolescence

While investments in compute infrastructure are essential for scaling AI capabilities, they carry significant risks as a result of technological obsolescence. Several interrelated factors threaten the economic viability of such capital-intensive spending, especially as hardware lifecycles shrink and the AI ecosystem rapidly evolves.

Short GPU Lifespan

GPUs may have a functional lifespan of 2–5 years, depending on utilization intensity and operating conditions. While precise official data is limited, industry observations suggest that GPUs running at consistently high utilization often last only 2–3 years before experiencing significant performance degradation. High-performance GPUs such as NVIDIA’s A100 and H100 are engineered for compute-intensive workloads, but their reliability can diminish over time due to sustained heavy usage. One of the key factors contributing to this degradation is thermal stress. With chips like the H100 consuming up to 700 watts of power, significant heat is generated during operation. Prolonged exposure to elevated temperatures accelerates wear on internal components, leading to reduced performance, higher error rates, and ultimately, a shortened operational lifespan. Thermal-induced degradation is therefore a critical concern, especially in large-scale AI deployments where GPUs are run at high utilization for extended periods.
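The scale of the thermal problem can be sketched with back-of-the-envelope arithmetic. The 700 W board power comes from the text above; the utilization and cooling-overhead (PUE) values are illustrative assumptions:

```python
# Back-of-the-envelope energy and heat load for a sustained-utilization
# accelerator, using the ~700 W board power cited for the H100.
# Utilization and PUE (power usage effectiveness) are illustrative assumptions.

def annual_energy_kwh(board_watts: float, utilization: float = 0.9,
                      pue: float = 1.4) -> float:
    """Facility-level energy per GPU per year, including cooling overhead (PUE)."""
    hours_per_year = 365 * 24
    return board_watts / 1000 * utilization * pue * hours_per_year

per_gpu = annual_energy_kwh(700)   # one H100-class card
cluster = per_gpu * 18_000         # a deployment on the scale of the Mission's first phase

print(f"Per GPU: {per_gpu:,.0f} kWh/year")
print(f"Cluster: {cluster / 1e6:,.1f} GWh/year")
```

Under these assumptions, each card dissipates several thousand kWh per year, and a first-phase-scale cluster runs to triple-digit GWh annually. Nearly all of that energy becomes heat that must be removed continuously, which is why thermal management is inseparable from lifespan.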


Additionally, manufacturers often phase out support for older models. For example, NVIDIA declared its A100 series end-of-life (EOL) in February 2024, merely four years after its 2020 release. 15 EOL status halts software updates, security patches, and technical support, diminishing utility for demanding, state-of-the-art applications. This rapid turnover cycle places public and private investments at risk, as outdated GPUs may be unable to support emerging, next-generation AI models.


Rapid Technological Evolution

The AI hardware landscape evolves rapidly, with new architectures like NVIDIA’s B200 and AMD’s MI300X offering significant performance improvements. For instance, the H100 GPU is up to 3x faster than the A100 for certain AI workloads. 16 Emerging technologies, such as quantum computing and specialized AI accelerators (e.g., Google’s TPUs), could further disrupt GPU dominance. The rise of efficient models, like DeepSeek’s R1, which require fewer GPUs, suggests that compute-intensive approaches may become less critical, increasing the risk of overinvestment in current-generation infrastructure. Further, forecasts also predict that AI accelerators based on ASICs (application-specific integrated circuits) will play a major role in inference workloads, a major shift from the current norm of using CPU-GPU combinations for both training and inference. 17
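The economic effect of such generational jumps can be made explicit. The sketch below (with hypothetical, roughly equal card prices) shows how a successor that is 3x faster at the same price triples the older card's cost per unit of work relative to the state of the art:

```python
# Illustrative price-performance erosion: when a successor generation delivers
# `speedup`-times the throughput, the older card's cost per unit of work rises
# relative to the state of the art. All prices here are hypothetical.

def relative_cost_per_unit_work(old_price: float, new_price: float,
                                speedup: float) -> float:
    """Ratio of the old card's cost per unit of work to the new card's.

    Old card delivers 1 throughput unit; new card delivers `speedup` units.
    """
    return old_price * speedup / new_price

# Hypothetical: both generations priced at ~$25k, successor 3x faster.
print(relative_cost_per_unit_work(25_000, 25_000, 3.0))  # 3.0
```

The hardware has not slowed down; it has simply become three times less competitive per dollar, which is what drives early retirement well before physical failure.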


Financial Risks

As countries and corporations globally pour billions into building AI infrastructure, especially into procuring high-performance GPUs, the financial and strategic risks of such capital-intensive investments are becoming increasingly evident. A primary concern is the rapid obsolescence of hardware. AI processors evolve at an extraordinary pace, with newer architectures like NVIDIA’s Hopper and Blackwell series quickly supplanting older models such as the A100. This rapid turnover shortens the effective lifespan of major infrastructure projects.

India's own example is telling. Under the IndiaAI Mission, ₹4,563 crore (approximately 44% of the total budget) has been earmarked for GPU procurement. 18 However, concerns have emerged over the inclusion of 176 NVIDIA A100 GPUs in its 18,693-unit plan, as the A100 has already been declared End-of-Life (EOL) as of February 2024. 15 This raises questions about long-term cost-effectiveness, especially as subsidies are being directed toward hardware already on the brink of obsolescence. If infrastructure becomes outdated before delivering expected returns, stakeholders face financial losses and a reduced ability to compete in the global AI race.
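The stranded-asset risk is easy to quantify. Assuming an illustrative $25,000 purchase price for a high-end card (an assumption, not an official figure), the capital cost that must be recovered per utilized GPU-hour rises steeply as the useful lifespan shrinks:

```python
# Stranded-asset arithmetic: the amortized capital cost of a GPU-hour rises
# sharply as useful lifespan shrinks. Purchase price and utilization below
# are illustrative assumptions, not procurement figures.

def cost_per_gpu_hour(purchase_price: float, lifespan_years: float,
                      utilization: float = 0.7) -> float:
    """Capital cost recovered per utilized GPU-hour over the card's lifespan."""
    usable_hours = lifespan_years * 365 * 24 * utilization
    return purchase_price / usable_hours

# A hypothetical $25,000 card at 70% average utilization:
for years in (5, 3, 2):
    print(f"{years}-year life: ${cost_per_gpu_hour(25_000, years):.2f}/GPU-hour")
```

Notably, under these assumptions a two-year life pushes the capital component alone above the roughly $1 per GPU-hour rate at which Mission compute is being offered, before power, cooling, and operations are counted.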

On the corporate front, Big Tech players have made enormous investments in GPU infrastructure. Microsoft, for instance, has acquired 485,000 NVIDIA Hopper GPUs (primarily H100s) since 2022, spending an estimated $12.1–$14.6 billion. 19 Meta followed with 224,000 H100 GPUs and now claims compute capacity equivalent to 600,000 H100s, contributing to a broader $10.5 billion AI infrastructure budget. 20 Google has secured 196,000 Hopper GPUs in 2024, up from 60,000 in 2022, and has committed another $10 billion to acquire 400,000 next-generation GB200 GPUs in 2025 as part of its $75 billion CapEx plan heavily focused on AI data centers. 21 While these companies are betting big on AI, profitability remains elusive across much of the sector. Leading AI firms, especially those developing generative AI tools, have yet to turn a profit. OpenAI, the creator of ChatGPT, reported a loss of $5 billion against $3.7 billion in revenue, underscoring the uncertain financial returns in the short term despite massive infrastructure investment. 22

Beyond India, similar risks exist for other nations and organizations investing in GPU-centric AI infrastructure without well-defined upgrade paths, sustainability frameworks, or lifecycle planning. Outdated hardware may quickly become incompatible with new AI models, thereby diminishing the utility and return on massive capital expenditures. Additionally, overreliance on a single vendor, currently NVIDIA with nearly 80% of the high-performance GPU market, introduces supply chain fragility, heightens exposure to pricing shocks, and risks geopolitical disruption, particularly amid rising export restrictions and competition for semiconductor resources.


Strategic Implications

Nations failing to align their AI compute strategies with global technology trajectories risk falling behind in both capability and sovereignty. Without robust planning for upgradability, maintenance, and diversification, these investments may entrench dependence on a narrow set of vendors and outdated technologies. The absence of domestic chip manufacturing capacity, limited open-source AI engagement, and the lack of hardware diversity further amplify these vulnerabilities.

To mitigate these risks, long-term strategies must go beyond one-time hardware acquisitions. Countries should prioritize lifecycle planning, foster local innovation ecosystems, and collaborate on international standards and supply chain resilience. 23 Building flexible, scalable compute infrastructures, complemented by strategic investments in R&D, open innovation, and domestic capability development, will be essential to navigating the fast-moving AI landscape without succumbing to premature obsolescence.


Way Forward

As the global race to lead in artificial intelligence intensifies, countries are making significant investments in AI infrastructure, particularly in high-performance computing resources such as GPUs. However, the rapid pace of innovation in AI models and hardware poses a serious risk of obsolescence. Without proactive planning, today’s state-of-the-art systems may quickly become outdated, undermining the long-term return on public and private capital. Addressing this risk requires not only sound procurement policies but also system-level design thinking that embeds flexibility, scalability, and upgradability into AI infrastructure from the outset.

India’s ambitious push to establish itself as a global AI hub through the IndiaAI Mission rests fundamentally on the creation of a robust and future-ready AI compute infrastructure. However, the rapid evolution of hardware, particularly GPUs, poses a significant risk of obsolescence, threatening to undermine the long-term utility and financial sustainability of the infrastructure being developed. With high-end GPUs such as NVIDIA’s A100 already being superseded by newer models like the H100, H200, and Blackwell chips within a typical lifecycle of just 2–5 years, there is a real danger that major investments could soon become stranded assets unless accompanied by a strategic and forward-looking obsolescence mitigation framework.

Equally important is the need to maximize the utility of existing infrastructure. This can be achieved by enhancing utilization efficiency through better scheduling and resource sharing, extending GPU lifespan via redeployment for less demanding tasks, and exploring leasing or refurbishment models for tiered access across institutions. The proposed IndiaAI compute access portal could serve as a national platform to coordinate such efforts, ensuring democratized and judicious use of high-performance computing resources.

Software-level innovation also holds immense potential in mitigating hardware obsolescence. By promoting algorithmic efficiency through techniques like quantization, pruning, and low-precision model training, India can enable contemporary AI workloads to run on older GPUs, thus extending their useful life. Supporting open-source communities and startups working on compute-efficient models can further amplify this impact. Simultaneously, designing data centers that are modular and scalable, capable of supporting upgrades without complete overhauls, will provide the infrastructure with the agility it needs to keep pace with fast-moving global trends.
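Of these techniques, quantization is the most widely deployed. The minimal sketch below shows the core idea, mapping float32 weights to int8 with a single per-tensor scale, which cuts memory footprint by 4x; production toolchains such as PyTorch and TensorRT layer calibration, per-channel scales, and fused kernels on top of this:

```python
import numpy as np

# Minimal post-training quantization sketch: float32 weights -> int8 plus a
# per-tensor scale. This is the core idea behind fitting newer models onto
# older or memory-constrained GPUs; real toolchains are far more elaborate.

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization: float32 -> (int8 codes, scale)."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Approximate reconstruction of the original float32 weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
q, scale = quantize_int8(w)

# int8 storage is 4x smaller, at the cost of a bounded reconstruction error.
err = np.abs(dequantize(q, scale) - w).max()
print(f"Max abs error: {err:.4f} (scale = {scale:.4f})")
```

The reconstruction error is bounded by half the scale, which is why well-behaved weight tensors tolerate int8 storage with little accuracy loss while halving or quartering the memory an older card must supply.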

To support this multifaceted transformation, India’s policy ecosystem must align with the long-term goals of adaptability and resilience. Fiscal incentives for sustainable and upgrade-friendly designs, targeted R&D funding under the Digital India budget, and alignment with global best practices through platforms like GPAI can collectively reinforce the foundation of a vendor-agnostic and future-proof AI compute ecosystem. Furthermore, the establishment of an institutional mechanism to monitor emerging hardware trends, conduct annual infrastructure audits, and recommend timely course corrections would ensure that the country remains technologically competitive and strategically informed.

As India prepares to deploy more investment under its IndiaAI Mission, the focus must shift from merely scaling capacity to sustaining it wisely. Prioritizing obsolescence mitigation at every stage, from procurement and deployment to utilization and disposal, will not only enhance returns on public investment but also ensure continuity in research, innovation, and access. Ultimately, India’s leadership in AI will be measured not just by the quantum of its infrastructure, but by the quality of its foresight and the resilience of its systems. By embedding adaptability and lifecycle consciousness into the very architecture of the IndiaAI initiative, the country can lay the groundwork for sovereign, scalable, and sustainable AI leadership in the years to come.


References:

1. https://www.marketsandmarkets.com/Market-Reports/ai-infrastructure-market-38254348.html
2. https://www.barrons.com/articles/nvidia-stock-ai-chip-revenue-98dfccf1
3. https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/ai-power-expanding-data-center-capacity-to-meet-growing-demand
4. https://builtin.com/artificial-intelligence/ai-bubble
5. https://timesofindia.indiatimes.com/technology/tech-news/how-nvidia-ceo-jensen-huangs-joke-about-blackwell-gpus-may-hurt-amazon-google-and-meta/articleshow/119370907.cms
6. https://massedcompute.com/faq-answers/?question=What%20is%20the%20typical%20lifespan%20of%20an%20NVIDIA%20data%20center%20GPU
7. https://cio.economictimes.indiatimes.com/news/corporate-news/global-ai-race-india-to-build-own-foundational-models-18693-gpus-to-power-compute-facility/117790662
8. https://www.hpcwire.com/2024/06/10/nvidia-shipped-3-76-million-data-center-gpus-in-2023-according-to-study/
9. https://economictimes.indiatimes.com/tech/artificial-intelligence/proposed-us-restriction-on-ai-chip-export-threatens-indias-ai-hardware-plans-iesa/articleshow/117296627.cms?from=mdr
10. https://www.ibef.org/blogs/booming-data-centre-growth-in-india
11. https://indiaai.gov.in/
12. https://www.moneycontrol.com/technology/indiaai-mission-meity-invites-applications-from-industry-for-providing-ai-services-on-cloud-article-12799606.html
13. https://economictimes.indiatimes.com/tech/technology/govt-gets-bids-offering-18000-gpus-in-round-2-of-indiaai-mission-tender/articleshow/121118580.cms?from=mdr
14. https://economictimes.indiatimes.com/tech/technology/india-ai-mission-gpu-hourly-pricing-aggressively-low/articleshow/117799835.cms?from=mdr
15. https://www.firstpost.com/tech/indian-startups-being-offered-last-gen-end-of-life-gpus-in-indiaai-mission-report-13864326.html
16. https://wccftech.com/amd-mi300x-3x-faster-nvidia-h100-llm-inference-ai-benchmarks-competitive-pricing/
17. https://www.mckinsey.com/industries/semiconductors/our-insights/generative-ai-the-next-s-curve-for-the-semiconductor-industry
18. https://www.moneycontrol.com/technology/indiaai-mission-selects-10-firms-to-bid-for-10-000-gpu-mega-tender-article-12916413.html
19. https://wireunwired.com/microsoft-emerges-as-the-largest-buyer-of-nvidia-hopper-gpus-in-2024-bought-485000-gpus/
20. https://www.datacenterdynamics.com/en/news/meta-to-operate-600000-gpus-by-year-end/
21. https://www.datacenterdynamics.com/en/news/microsoft-bought-twice-as-many-nvidia-hopper-gpus-as-other-big-tech-companies-report/
22. https://www.business-standard.com/world-news/insane-thing-sam-altman-says-openai-losing-money-on-chatgpt-subscription-125010700954_1.html
23. https://www.lek.com/insights/tmt/us/ei/navigating-chip-shortages-and-nvidia-frenzy